1.6 Sequential decision theory. As previously, let the sample space be a measurable
space $(X, \mathcal{B})$ and $\mathcal{P} = \{P_\theta,\ \theta \in \Theta\}$ a family of laws on it. Let $X_1, X_2, \ldots$ be independent
and identically distributed with law $P_\theta$. Let $A$ be the space of possible (specific) actions,
with a $\sigma$-algebra $\mathcal{E}$. We have a $\sigma$-algebra $\mathcal{T}$ on $\Theta$ and a loss function $L$, which is a measurable
function from $\Theta \times A$ to $[0, \infty]$. A prior $\pi$ may or may not be given on $\Theta$.

A sequential decision rule will consist of two functions $N$ and $\delta$ as follows. Let $X^\infty$
be the set of all sequences $\{x_n\}_{n \ge 1}$ with $x_n \in X$ for all $n$. For $n = 1, 2, \ldots$, let $\mathcal{B}_n$ be
the smallest $\sigma$-algebra of subsets of $X^\infty$ for which $X_1, \ldots, X_n$ are measurable. Let $\mathcal{B}_0$ be
the trivial $\sigma$-algebra $\{\emptyset, X^\infty\}$. Then $N$ is a function from $X^\infty$ into $\mathbb{N} := \{0, 1, 2, \ldots\}$
such that for each $k = 0, 1, 2, \ldots$, $\{N \le k\} \in \mathcal{B}_k$. Such a function is called a stopping
time or stopping rule. The terminal decision rule, $\delta$, is a sequence of functions $\{\delta_n\}_{0 \le n < \infty}$
where $\delta_0 \in A$ and for each $n \ge 1$, $\delta_n$ is a measurable function from $X^n$ into $A$. The action
actually taken will be $\delta_N := \delta_N(X_1, \ldots, X_N)$. Let $\phi := \{N(\cdot), \delta\}$.

If $c \ge 0$ is the cost of each observation, the total loss (including costs of observations)
in a given case is $L(\theta, \delta_N) + Nc$. The risk is then

$$r(\theta, \phi) := c \cdot EN + EL(\theta, \delta_N)$$

where the expectations are with respect to $P_\theta^\infty$ on $X^\infty$. Note that if $c = 0$, and $N$ is
required to take finite values as in the above definition, then in general, optimal rules do
not exist.
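
The definitions of stopping time, terminal decision, total loss, and risk can be made concrete
with a small simulation. The following is only a sketch under assumptions that are not in the
text: observations are Bernoulli (0 or 1), the parameter takes the two values 0.3 and 0.7, the
loss is 0-1, and the hypothetical rule stops when the count of ones minus the count of zeros
first reaches $\pm b$ or when a cap is hit. The names `stopping_time`, `terminal_decision`,
`total_loss`, and `estimate_risk` are invented for this illustration.

```python
import random

def stopping_time(xs, b=2, cap=50):
    """Assumed rule: smallest n with |S_n| >= b, capped at `cap`.

    S_n adds +1 for each observed 1 and -1 for each observed 0. Whether
    N <= k is decided by xs[:k] alone, which is the stopping-time property.
    """
    s = 0
    for n, x in enumerate(xs, start=1):
        s += 1 if x == 1 else -1
        if abs(s) >= b or n >= cap:
            return n
    raise ValueError("sequence too short to reach a decision")

def terminal_decision(prefix):
    """Assumed delta_n: choose 0.7 if ones are in the strict majority, else 0.3."""
    ones = sum(prefix)
    return 0.7 if 2 * ones > len(prefix) else 0.3

def total_loss(theta, xs, c):
    """L(theta, delta_N) + N*c, with 0-1 loss (an assumption of this sketch)."""
    n = stopping_time(xs)
    action = terminal_decision(xs[:n])
    loss = 0.0 if action == theta else 1.0
    return loss + n * c

def estimate_risk(theta, c, reps=20000, seed=0):
    """Monte Carlo estimate of r(theta, phi) = c*E[N] + E[L(theta, delta_N)]."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        # 50 draws suffice because the stopping rule is capped at 50.
        xs = [1 if rng.random() < theta else 0 for _ in range(50)]
        total += total_loss(theta, xs, c)
    return total / reps
```

For instance, `total_loss(0.7, [1, 1] + [0] * 48, 0.1)` stops after two observations, decides
correctly, and so incurs only the observation cost. The cap keeps $N$ finite-valued, as the
definition requires.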

Example. This is actually not a statistical decision problem as just formulated but it will 
illustrate some possible difficulties. 

Suppose that a gambler can play a sequence of games as follows. In the $n$th game,
the gambler wagers \$1 and wins $\$100 \cdot 2^{n+1}$ with probability $0.01/2^n$. Thus the expected
gain in each game is $\$2 - \$1 = \$1$. So the "Bayes" or optimal strategy would seem to
be to continue playing indefinitely. But the probability that the gambler ever wins is
at most $\sum_{n \ge 1} 0.01/2^n = 0.01$. If the gambler never wins, which occurs with probability $\ge 0.99$,
then the gambler wagers and loses infinitely many dollars. The expected or average gain
from games won is also infinite, if the gambler continues to play, so that the overall average
gain is $\infty - \infty$ (undefined). There is actually no Bayes (optimal) strategy.

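Let $f_n$ be the net winnings after $n$ plays. Then $Ef_n = n \to +\infty$ while $f_n \to -\infty$ a.s.
by the Borel-Cantelli lemma: the win probabilities have a finite sum, so almost surely only
finitely many games are won, while the stakes lost grow without bound.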
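
The arithmetic in this example can be checked exactly with rational numbers. The sketch below
is an illustration rather than part of the text; the function names are invented here, and
`prob_some_win` additionally assumes that the games are independent, which the union bound
itself does not need.

```python
from fractions import Fraction

def win_prob(n):
    """Probability of winning the nth game: 0.01 / 2**n."""
    return Fraction(1, 100) / 2**n

def prize(n):
    """Payout in dollars when the nth game is won: 100 * 2**(n+1)."""
    return 100 * 2**(n + 1)

def expected_net_winnings(n):
    """E f_n: sum over games 1..n of (expected payout minus the $1 stake)."""
    return sum(win_prob(k) * prize(k) - 1 for k in range(1, n + 1))

def union_bound(n):
    """Sum of the win probabilities for games 1..n, an upper bound on P(some win)."""
    return sum(win_prob(k) for k in range(1, n + 1))

def prob_some_win(n):
    """Exact P(at least one win in games 1..n), assuming independent games."""
    p_no_win = Fraction(1)
    for k in range(1, n + 1):
        p_no_win *= 1 - win_prob(k)
    return 1 - p_no_win
```

Here `expected_net_winnings(n)` equals `n` for every `n`, while `union_bound(n)` equals
$0.01\,(1 - 2^{-n})$ and so never reaches 0.01, matching the two facts used in the example.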

We can also consider sequential randomized decision rules defined as follows. For
$n = 1, 2, \ldots$, let $(A_n, \mathcal{E}_n)$ be a measurable space, where $A_n$ is the space of specific actions
which can be taken after $n$ observations. Often, all $(A_n, \mathcal{E}_n)$ will be equal to one space
$(A, \mathcal{E})$. Assume that for each $n$, $0 \in A_n$, where the action "0" will mean taking another
observation $X_{n+1}$, while all other actions in $A_n$ will imply taking no more observations.
Each $\phi_n$ is a measurable function from $(X^n, \mathcal{B}^n)$ into the space of all probability laws
on $(A, \mathcal{E})$. So, given $X_1, \ldots, X_n$, if no decision to stop has been made earlier, we then take
another observation with probability $\phi_n(X_1, \ldots, X_n)(\{0\})$ and otherwise stop and take an
action chosen from $A_n \setminus \{0\}$ with distribution $\phi_n(X_1, \ldots, X_n)/(1 - \phi_n(X_1, \ldots, X_n)(\{0\}))$.
So for a sequential randomized test of $P$ vs. $Q$, we can take $A_n = A = \{-1, 0, 1\}$ for all $n$,
where (as in Sec. 1.5) $-1$ means choosing $P$ and $+1$ means choosing $Q$.
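
The mechanics of such a rule with $A = \{-1, 0, 1\}$ can be sketched as follows. This is a
minimal illustration, not a construction from the text: `run_randomized_rule` and `make_phi`
are hypothetical names, observations are assumed to be $\pm 1$ steps, and the example rule
(stop for certain once the running sum reaches $\pm 2$, stop with a chosen probability when it
is $\pm 1$) is invented purely to exercise the definition.

```python
import random

CONTINUE = 0  # the action "0": take another observation

def run_randomized_rule(observations, phi, rng):
    """Apply a sequential randomized rule to a finite stream of observations.

    phi(prefix) returns a dict mapping each action in {-1, 0, +1} to its
    probability given the observations so far. Returns (n, action), where
    n is the number of observations used before stopping.
    """
    prefix = []
    for x in observations:
        prefix.append(x)
        law = phi(tuple(prefix))
        p_continue = law.get(CONTINUE, 0.0)
        if rng.random() < p_continue:
            continue  # action 0 was drawn: observe the next X
        # Stop: draw from the law restricted to nonzero actions,
        # renormalized by 1 - phi_n({0}).
        actions = [a for a in law if a != CONTINUE]
        weights = [law[a] / (1.0 - p_continue) for a in actions]
        return len(prefix), rng.choices(actions, weights=weights)[0]
    raise RuntimeError("observations exhausted before the rule stopped")

def make_phi(early_stop_prob):
    """Hypothetical phi_n for +/-1 observations (an assumption of this sketch)."""
    def phi(prefix):
        s = sum(prefix)
        if abs(s) >= 2:
            p_stop = 1.0
        elif abs(s) == 1:
            p_stop = early_stop_prob
        else:
            p_stop = 0.0
        choice = 1 if s > 0 else -1  # which hypothesis to choose on stopping
        return {CONTINUE: 1.0 - p_stop, choice: p_stop, -choice: 0.0}
    return phi
```

With `early_stop_prob` equal to 0 or 1 the rule is nonrandomized; intermediate values give a
genuinely randomized stopping decision, for example
`run_randomized_rule([1, -1, 1, 1], make_phi(0.5), random.Random(1))`.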

PROBLEMS 

1. In the example at the end of Sec. 1.5 and where $R_{Q/P}$ has only values $t$ or $1/t$ (not 1)
with $t = 2$, let $\phi$ be a randomized test that does SPRT(1/8, 2) or SPRT(1/2, 8) with
probability 1/2 each. Compare the performance of this test to SPRT(1/4, 4) in terms
of error probabilities and average sample numbers.

2. For a sequential test of $P$ vs. $Q$ as in Problem 1, suppose that the $n$th observation costs
$1/3^n$, while $L_{PQ} = L_{QP} = 3$ and $\pi(P) = \pi(Q) = 1/2$. A decision rule must reach a
decision after a finite number $N$ of observations. Is there an optimal (Bayes) sequential
test in this case? Why, or why not?